11 research outputs found

    Verification of Authenticity of Stamps in Documents

    Classical ink stamps and seals used to authenticate document content have become relatively easy to forge with the scan-and-print technique, since the technology is now available to the general public. An automatic system for verifying the authenticity of stamps is being developed within this master's thesis for environments where a large volume of documents is processed. Stamp authenticity verification must naturally be preceded by stamp detection and segmentation, a difficult task in Document Image Analysis (DIA). This master's thesis proposes a novel method for detecting and verifying stamps in color document images. It involves a full segmentation of the page to identify candidate solutions, extraction of features, and classification of the candidates by means of support vector machines (SVM). The evaluation has shown that the algorithm is capable of differentiating stamps from other color objects in the document, such as logos or text, and genuine stamps from copied ones.

    Simulation of Finite Transducers

    A quick translation between binary code and assembler for the purpose of simulation can be done by a special abstract model, a two-way coupled finite automaton. Its inner representation brings us to the question of finite transducers. Because simulating deterministic transducers is more efficient, we have to deal with the process of determinization. Unfortunately, the existing algorithms are applicable only to transducers translating finite languages, whereas we expect a generally infinite language on input. It is therefore important to find a way to quickly detect whether the input transducer is determinizable. This bachelor's thesis brings together previously published results on the determinization of finite transducers and presents a new algorithm for determinizing transducers that translate generally infinite languages. Non-determinizable input transducers are detected.

    On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study

    The evaluation of unsupervised outlier detection algorithms is a constant challenge in data mining research. Little is known regarding the strengths and weaknesses of different standard outlier detection models, and the impact of parameter choices for these algorithms. The scarcity of appropriate benchmark datasets with ground truth annotation is a significant impediment to the evaluation of outlier methods. Even when labeled datasets are available, their suitability for the outlier detection task is typically unknown. Furthermore, the biases of commonly-used evaluation measures are not fully understood. It is thus difficult to ascertain the extent to which newly-proposed outlier detection methods improve over established methods. In this paper, we perform an extensive experimental study on the performance of a representative set of standard k nearest neighborhood-based methods for unsupervised outlier detection, across a wide variety of datasets prepared for this purpose. Based on the overall performance of the outlier detection methods, we provide a characterization of the datasets themselves, and discuss their suitability as outlier detection benchmark sets. We also examine the most commonly-used measures for comparing the performance of different methods, and suggest adaptations that are more suitable for the evaluation of outlier detection results.
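
    The abstract above does not name the specific methods or measures studied. As a rough illustration of the general idea only, the sketch below scores each point by the distance to its k-th nearest neighbour and then computes precision at n against ground-truth labels. The function names, the toy data, and the choice of these two particular techniques are assumptions made for illustration, not taken from the paper.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbour.

    Illustrative assumption: a plain kNN-distance score, one simple member
    of the k-nearest-neighbour family of outlier detectors.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def precision_at_n(scores, labels, n):
    """Fraction of the n highest-scoring points that are labelled outliers."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:n]) / n


# Hypothetical toy dataset: four clustered points and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
labels = [0, 0, 0, 0, 1]  # 1 marks the ground-truth outlier

scores = knn_outlier_scores(points, k=2)
print(scores)
print(precision_at_n(scores, labels, n=1))
```

    On this toy data the distant point receives the largest score, so it is ranked first. Real benchmark datasets are far less clear-cut, which is where the choice of k and of the evaluation measure starts to matter.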

    Colorectal tumour mucosa microbiome is enriched in oral pathogens and defines three subtypes that correlate with markers of tumour progression

    Long-term dysbiosis of the gut microbiome has a significant impact on colorectal cancer (CRC) progression and explains part of the observed heterogeneity of the disease. Even though the shifts in gut microbiome in the normal-adenoma-carcinoma sequence were described, the landscape of the microbiome within CRC and its associations with clinical variables remain under-explored. We performed 16S rRNA gene sequencing of paired tumour tissue, adjacent visually normal mucosa and stool swabs of 178 patients with stage 0–IV CRC to describe the tumour microbiome and its association with clinical variables. We identified new genera associated either with CRC tumour mucosa or CRC in general. The tumour mucosa was dominated by genera belonging to oral pathogens. Based on the tumour microbiome, we stratified CRC patients into three subtypes, significantly associated with prognostic factors such as tumour grade, sidedness and TNM staging, BRAF mutation and MSI status. We found that the CRC microbiome is strongly correlated with the grade, location and stage, but these associations are dependent on the microbial environment. Our study opens new research avenues in microbiome-based CRC biomarker detection of disease progression while identifying its limitations, suggesting the need to combine several sampling sites (e.g., stool and tumour swabs).